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Abstract 

We present MUSE, a software framework for combining existing computational 
tools for different astrophysical domains into a single multiphysics, multiscale appli- 
cation. MUSE facilitates the coupling of existing codes written in different languages 
by providing inter-language tools and by specifying an interface between each mod- 
ule and the framework that represents a balance between generality and computa- 
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tional efficiency. This approach allows scientists to use combinations of codes to solve 
highly-coupled problems without the need to write new codes for other domains or 
significantly alter their existing codes. MUSE currently incorporates the domains of 
stellar dynamics, stellar evolution and stellar hydrodynamics for studying general- 
ized stellar systems. We have now reached a "Noah's Ark" milestone, with (at least) 
two available numerical solvers for each domain. MUSE can treat multi-scale and 
multi-physics systems in which the time- and size-scales are well separated, like sim- 
ulating the evolution of planetary systems, small stellar associations, dense stellar 
clusters, galaxies and galactic nuclei. In this paper we describe three examples cal- 
culated using MUSE: the merger of two galaxies, the merger of two evolving stars, 
and a hybrid iV-body simulation. In addition, we demonstrate an implementation of 
MUSE on a distributed computer which may also include special-purpose hardware, 
such as GRAPEs or GPUs, to accelerate computations. The current MUSE code 
base is publicly available as open source at http://muse.li. 



1 Introduction 



The Universe is a multi-physics environment in which, from an astrophysical 
point of view, Newton's gravitational force law, radiative processes, nuclear 
reactions and hydrodynamics mutually interact. The astrophysical problems 
which are relevant to this study generally are multi-scale, with spatial and 
temporal scales ranging from 10 4 m and 10 -3 seconds on the small end to 
10 20 m and 10 17 s on the large end. The combined multi-physics, multi-scale 
environment presents a tremendous theoretical challenge for modern science. 
While observational astronomy fills important gaps in our knowledge by har- 
vesting ever-wider spectral coverage with continuously increasing resolution 
and sensitivity, our theoretical understanding lags behind these exciting de- 
velopments in instrumentation. 

In many ways, computational astrophysics lies intermediate between observa- 
tions and theory. Simulations generally cover a wider range of physical phe- 
nomena than observations with individual telescopes, whereas purely theoreti- 
cal studies are often tailored to solving sets of linearized differential equations. 
As soon as these equations show emergent behavior in which the mutual cou- 
pling of non-linear processes result in complex behavior, the computer provides 
an enormous resource for addressing these problems. Furthermore simulations 
can support observational astronomy by mimicking observations and aiding 
their interpretation by enabling parameter-space studies. They can elucidate 
the often complex consequences of even simple physical theories, like the non- 
linear behavior of many-body gravitational systems. But in order to deepen 
our knowledge of the physics, extensive computer simulations require large 
programming efforts and a thorough fundamental understanding of many as- 
pects of the underlying theory. 



From a management perspective, the design of a typical simulation package 
differs from construction of a telescope in one very important respect. Whereas 
modern astronomical instrumentation is generally built by teams of tens or 
hundreds of people, theoretical models are usually one-person endeavors. Pure 
theory lends itself well to this relatively individualistic approach, but scientific 
computing is in a less favorable position. So long as the physical scope of a 
problem remains relatively limited, the necessary software can be built and 
maintained by a single numerically educated astronomer or scientific program- 
mer. However, these programs are often "single-author, single-use" , and thus 
single-purpose: recycling of scientific software within astronomy is still rare. 

More complex computer models often entail non-linear couplings between 
many distinct, and traditionally separate, physical domains. Developing a sim- 
ulation environment suitable for multi-physics scientific research is not a sim- 
ple task. Problems which encompass multiple temporal or spatial scales are 
often coded by small teams of astronomers. Many recent suc cessful projects 



have been carried o ut in this way; examp l es are GADGET (ISpringel et al. 



200ll ). and starlab (IPortegies Zwart et all 1200 ll ). In all these cases, a team 



of scientists collaborated in writing a large-scale simulation environment. The 
resulting software has a broad user base, and has been applied to a wide va- 
riety of problems. These packages, however, each address one quite specific 
task, and their use is limited to a rather narrow physical domain. In addition, 
considerable expertise is needed to use them and expanding these packages 
to allow the addition of a new physical domain is hampered by early design 
choices. 

In this paper we describe a software framework that targets multi-scale, multi- 
physics problems in a hierarchical and internally consistent implementation. 
Its development is based on the philosophy of "open knowledge" [3 We call 
this environment MUSE, for MUltiphysics Software Environment. 



2 The concept of MUSE 



The developme nt of MUSE began during the MODEST-6al_i| workshop in 



Lund, Sweden ( iDavies et all 120061 ) . but the first lines of code were written 
during MODEST-6d/e in Amsterdam (the Netherlands). The idea of Noah's 
Ark (see §2.1) was conceived and realized in 2007, during the MODEST- 7f 



1 See for example prttp: //www. artcompsci .org/ok/. 

2 MODEST stands for MOdeling DEnse STellar Systems; the term was coined 
during the first MODEST meeting in New York (USA ] in 2001. The MODEST web 
page is [http : //www . manybody . org/modest; see also iHut et al.l ( 20031 ): ISills et al. 
(|2003h . 
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Fig. 1. Basic structure design of the framework (MUSE). The top layer (flow con- 
trol) is connected to the middle (interface layer) which controls the command struc- 
ture for the individual applications. In this example only an indicative selection of 
numerical techniques is shown for each of the applications. 

workshop in Amsterdam and the MODEST- 7a meeting in Split (Croatia). 

The development of a multi-physics simulation environment can be approached 
from a monolithic or from a modular point of view. In the monolithic approach 
a single numerical solver is systematically expanded to include more physics. 
Basic design choices for the initial numerical solver are generally petrified in 
the initial architecture. Nevertheless, such codes are sometimes successfully 
redesigned to include two or possibly even three solvers for a different physi- 
cal phenomeno n (see FLASH , where hydrodynamics has been combined with 
magnetic fields iFryxell et all l2000h ). Rather than forming a self-consistent 
framework, the different physical domains in these environments are made to 
co-exist. This approach is prone to errors, and the resulting large simulation 
packages often suffer from bugs, redundancy in source code, sections of dead 
code, and a lack of homogeneity. The assumptions needed to make these codes 
compile and operate without fatal errors often hampers the science. In addi- 
tion, the underlying assumptions are rarely documented, and the resulting 
science is often hard to interpret. 



We address these issues in MUSE by the development of a modular numerical 
environment, in which independently developed specialized numerical solvers 
are coupled at a meta level, resulting in the coherent framework as depicted 



in Fig. 1. Modules are designed with well defined interfaces governing their 
interaction with the rest of the system. Scheduling of, and communication 
among modules is managed by a top-level "glue" language. In the case of 
MUSE, this glue language is Python, chosen for its rich feature set, ease of 
programming, object-oriented capabilities, large user base, and extensive user- 
written software libraries. However, we have the feeling that Python is not 
always consistent and of equally high quality in all places. The objective of 
the glue code is to realize the interoperation between different parts of the 
code, which may be realized via object-relational mapping, in which individual 
modules are equipped with instruction sets to exchange information with other 
modules. 

The modular approach has many advantages. Existing codes which have been 
well tuned and tested within their own domains can be reused by wrapping 
them in a thin interface layer and incorporating them into a larger framework. 
The identification and specification of suitable interfaces for such codes allows 
them to be interchanged easily. An important element of this design is the 
provision of documentation and exemplars for the design of new modules and 
for their integration into the framework. A user can "mix and match" modules 
like building blocks to find the most suitable combination for the application 
at hand, or to compare them side by side. The first inte rface standard be - 



tween stellar evolution and stellar dynamics goes back to Ifiut et al.l (120031 ). 
The resulting software is also more easily maintainable, since all dependencies 
between a module and the rest of the system are well defined and documented. 

A particular advantage of a modular framework is that a motivated scholar 
can focus attention on a narrower area, write a module for it, then integrate it 
into the framework with knowledge of only the bare essentials of the interface. 
For example, it would take little extra work to adapt the results of a successful 
student project into a separate module, or for a researcher working with a code 
in one field of physics to find out how the code interacts in a multi-physics 
environment. The shallower learning curve of the framework significantly low- 
ers the barrier for entry, making it more accessible and ultimately leading to 
a more open and extensible system. 

The only constraint placed on a new module is that it must be written (or 
wrapped) in a programming language with a Foreign Function Interface that 
can be linked to a contemporary Unix-like system. As in the high-level lan- 
guage Haskell, a Foreign Function Interface provide a mechanism by which 
a program written in one language can call routines from another language. 
Supported languages include low-level (C, C++ and Fortran) as well as other 
high-level languages such as C#, Java, Haskell, Python and Ruby. Currently, 
individual MUSE modules are written in Fortran, C, and C++, and are inter- 
faced with Python using f 2py or swig. Several interfaces are written almost 
entirely in Python, minimizing the programming burden on the legacy pro- 



grammer. The flexibility of the framework allows a much broader range of 
applications to be prototyped, and the bottom-up approach makes the code 
much easier to understand, expand and maintain. If a particular combination 
of modules is found to be particularly suited to an application, greater effi- 
ciency can be achieved by hard coding the interfaces and factoring out the 
glue code, thus effectively ramping up to a specialized monolithic code. 



2.1 Noah's Ark 



Instead of writing all new code from scratch, in MUSE we realized a software 
framework in which the glue language provides an object-relational mapping 
via a virtual library which is used to bind a wide collection of diverse appli- 
cations. 

MUSE consists of a hierarchical component architecture that encapsulates 
dynamic shared libraries for simulating stellar evolution, stellar dynamics and 
treatments for colliding stars. As part of the MUSE specification, each module 
manages its own internal (application-specific) data, communicating through 
the interface only the minimum information needed for it to interoperate with 
the rest of the system. Additional packages for file I/O, data analysis and 
plotting are included. Our objective is eventually to incorporate treatments 
of gas dynamics and radiative transfer, but at this point these are not yet 
implemented. 

We have so far included at least two working packages for each of the domains 
of stellar collisions, stellar evolution and stellar dynamics, in what we call the 
"Noah's Ark" milestone. The homogeneous interface that connects the kernel 
modules enables us to switch packages at runtime via the scheduler. The goal 
of this paper is to demonstrate the modularity and interchangeability of the 
MUSE framework. In Tab. 1 we give an overview of the currently available 
modules in MUSE. 



2.1.1 Stellar dynamics 

To simulate gravitational dynamics (e.g. between stars and/or planets), we 
incorporate packages to solve Newton's equations of motion by means of grav- 
itational iV-body solvers. Several distinct classes of iV-body kernels are cur- 
rently available. These are based on the direct force evaluation methods or 
tree codes. 

Currently four direct iV-body methods are incorporated, all of which are 
based on the fourt h-order Hermite predictor- corrector iV-body integrator, with 



block time steps (IMakino fc Aarsethl . Il992l ). Some of them can benefit from 



Table 1 

Modules currently present (or in preparation) in MUSE. The codes are iden- 
tified by their acronym, which is also used on the MUSE repository at 
http : //muse 7Ti| followed by a short description. Some of the modules men- 
tioned here are used in S 3. Citations to the literature are indicated in the sec- 



ond co lu mn by their index llEggleton et a l. (1989), 2.Eg gletonl (120061'). 3lHut et al.l 

Jn 



(1995), 4 M akino k Aarsethl (ll992fl. 5JHarfst et al.J (120071). 6iBarnes fc Hutj (1986), 

2003); 



7 lLombardi et al.l (120031 1: s fevcerz et al.l d2008bl ]ah: fl lFregeau et al.l (12002 



Fuiii et al.l ( 20071 ). For a number of modules the source code is currently not 



10 

available within MUSE because they are not publicly available o r still under de- 
velop ment. Those are the Henyey stellar evoluti on code EVTwin (Eggletonl. 11971 



2006), the Monte-Carlo dynamics module cmc (jJoshi e t all 120001 : iFregeau et al 



2003), the hybrid iV-body integrator BRIDGE (JFuiii et all 120071 . used in §3.3) and 



)0; 



the Monte-Carlo radiative transfer code MCRT. 



MUSE module 



ref. language brief description 



EFT89 


1 


C 


EVTwin 


2 


F77/F90 


HermiteO 


3 


C++ 


NBODYlh 


4 


F77 


phiGRAPE 


5 


F77 


BHTree 


6 


C++ 


SmallN 




C++ 


Sticky Spheres 




C++ 


mmas 


7 


F77 


MCRT 




C++ 


Globus support 




Python 


HLA grid support 


8 


HLA 


Scheduler 




Python 


Unit module 




Python 


XML parser 




Python 



Parameterized stellar evolution 

Henyey code to evolve stars 

Direct A^-body integrator 

Direct A^-body integrator 

(parallel) direct A^-body integrator 

Barnes-Hut tree code 

Direct integrator for systems of few bodies 

Angular momentum and energy conserving collision treatment 

Entropy sorting for merging two stellar structures 

Monte- Carlo Radiative Transfer 



Support for performing simulations on distributed resources 

Support for performing simulations on distributed resources 

Schedules the calling sequence between modules 

Unit conversion 

Primitive parser for XML formatted data 



cmc 9 C Monte Carlo stellar dynamics module 

BRIDGE 10 C++ Hybrid direct A^-body with Barnes-Hut Tree code 



speci al-purpose h ardware such as GRAPE (IMakino fc T aiii. 



20011 ) or a GPU (IPorteries Zwart et all 120071 : iBelleman et al. 



1998; Making 



20081 ). Direct 



methods provides the high accuracy needed for simulating dense stellar sys- 
tems, but even with special computer hardware they lack the performance to 
simulate systems with more than ~ 10 6 particles. The Barnes-Hut tree-codes 
( IBarnes &: Hutl . Il986l ) are included for use in simulations of large- N systems. 
Two of the four codes are also GRAPE/GPU-enabled. 



2.1.2 Stellar evolution 

Many applications require the structure and evolution of stars to be followed 
at various levels of detail. At a minimum, the dynamical modules need to 
know stellar masses and radii as functions of time, since these quantities feed 
back into the dynamical evolution. At greater levels of realism, stellar tem- 
peratures and luminosities (for basic comparison with observations), photon 
energy distributions (for feedback on radiative transfer), mass loss rates, out- 
flow velocities and yields of various chemical elements (returned to the gas in 
the system) , and even the detailed interior structure (to follow the outcome of 
a stellar merger or collision), are also important. Consequently the available 
stellar evolution modules should ideally include both a very rapid but approx- 
imate code for applications where speed (enabling large numbers of stars) is 
paramount (e.g. when using the Barnes-Hut tree code to follow the stellar dy- 
namics) as well as a detailed (but much slower) structure and evolution code 
where accuracy is most important (for example when studying specific objects 
in relatively small but dense star clusters). 



Currently two stellar evolution modules are incorporated into MU SE. One, 



calle d EFT89, is based on fits to pre-calculated stellar evolution tracks (lEggleton et al 
19891 ). the other solves the set of coupled partial differential equations that 



describe stel lar structure and evolution following the Henyey code originally 
designed by lEggletonl (1197ll ). The latter code, called EVTwin is a fully im- 
plicit stellar evolution code that solves the stellar structure equations and 
the reaction-di ffusion equations for th e six major isotopes concurrently on an 
adaptive mesh (IGlebbeek et all 120081 ). EVTwin is designed to follow in detail 
the internal evolution of a star of arbitrary mass. The basic code, written in 
Fortran 77/90, operates on a single star — that is, the internal data structures 
(Fortran common blocks) describe just one evolving object. The EVTwin vari- 
ant describes a pair of stars, the components of a binary, and includes the 
possibility of mass transfer between them. A single star is modeled as a pri- 
mary with a distant, non-evolving secondary. The lower speed of EVTwin is 
inconvenient, but the much more flexible description of the physics allows for 
more realistic treatment of unconventional stars, such as collision products. 



Only two EVTwin functions — the "initialize" and "evolve" operations — are ex- 



posed to the MUSE environment. The F90 wrapper also is minimal, providing 
only data storage and the commands needed to swap stellar models in and 
out of EVTwin and to return specific pieces of information about the stored 
data. All other high-level control structures, including identification of stars 
and scheduling their evolution, is performed in a python layer that forms the 
outer shell of the EVTwin interface. The result is that the structure and logic of 
the original code is largely preserved, along with the expertise of its authors. 



2.1.3 Stellar collisions 

Physical interactions between stars are currently incorporated into MUSE 
by means of one of two simplified hydrodynamic solvers. The simpler of the 
two is based on the "sticky sphere" approximation, in which two objects 
merge when their separation becomes less than the sum of their effective 
radii. The connection between effective and actual radius is calibrated using 
more realistic SPH simulations of stel lar collisions. The seco nd is based on the 
make -me -a- star (MMAS) packagd_J ( Lombardi et al. L 20031) an d its extension 



make-me-a-massive-star0 (MMAMS, JGaburov etaD (J2008J )). MMA(M)S 



constructs a merged stellar model by sorting the fluid elements of the original 
stars by entropy or density, then recomputing their equilibrium configuration, 
using mass loss and shock heating data derived from SPH calculations. Ulti- 
mately, we envisage inclusion of a full SPH treatment of stellar collisions into 
the MUSE framework. 

MMAS (and MMAMS) can be combined with full stellar evolution models, 
as they process the internal stellar structure in a similar fashion to the stellar 
evolution codes. The sticky sphere approximation only works with parame- 
terized stellar evolution, as it does not require any knowledge of the internal 
stellar structure. 



2.1-4 Radiative transfer 

At this moment one example module for performing rudimentary radiative 
transfer calculations is incorporated in MUSE. The module uses a discrete 
grid of cells filled with gas or dust which is parameterized in a local density p 
and an opacity k, with which we calculate the optical depth (/ pndx) . A star, 
that may or may not be embedded in one of the grid cells emits L photons, 
each of which is traced through the medium until it is absorbed, escapes or 
lands in the camera. In each cloud cell or partial cell a photon has a finite 
probability that it is scattered or absorbed. This probability is calculated by 
solving the scattering function /, which depends on the angles and the Stokes 



See |http : //webpub . allegheny . edu/employee/ j / j alombar/mmas/ 



See http://modesta. science.uva.nl/ 



param eter. We adopt electron scattering for gas an d iHenyey fc Greenstein 



( 1194ll ) for dust scattering (see (jErcolano et al.l . 120051 ) for details). 



Since this module is in a rather experimental stage we only present two images 
of its working, rather than a more complete description in §3. In Fig. 2 we 
present the resul t of a clus t er sim ulation using 1024 stars which initially were 



distributed in a iPlummerl (119111 ) sphere with a virial radius of 1.32 pc and 
i n which the m asses of the stars ware selected randomly from the Salpeter 
( ISarpeterl . ll955l ) mass function between 1 and 100 M Q , totaling the cluster mass 



to about 750 M m . These parameters ware selected to mimic the Pleiades cluster 



±Oi 



( IPortegies Zwart et al.l . |2001| ). The cluster was scaled to virial equilibrium 
before we started its evolution. The cluster is evolved dynamically using the 
BHTree package and the EFT89 module is used for evolving the stars. 

We further assumed th at the cluster was embedded in a giant molecular cloud 



( jWilliams et al.l . l2000l ). The scattering parameters were set to simulate visible 
light. The gas and dust was distributed in a homogeneous cube with 5pc on 
each side which was divided into 1000 x 1000 x 100 grid cells with a density 
of 10 2 H2 particles/cm 3 . 

In Fig. 2 we present the central 5 pc of the cluster at an age of 120 Myr. The 
luminosity and position of the stars are observed from the z-axis, i.e. they are 
projected on the xy-plane. In the left panel we present the stellar luminosity 
color coded, and the size of the symbols reflects the distance from the observer, 
i.e., there it gives an indication of how much gas is between the star and the 
observer. The right image is generated using the MCRT module in MUSE and 
shows the photon-packages which were traced from the individual stars to the 
camera position. Each photon-package represents a multitude of photons. 



2.2 Units 



A notorious pitfall in combining scientific software is the failure to perform 
correct conversion of physical units between modules. In a highly modular 
environment such as MUSE, this is a significant concern. One approach to 
the problem could have been to insist on a standard set of units for modules 
incorporated into MUSE but this is neither practical nor in keeping with the 
MUSE philosophy. 

Instead, in the near future, we will provide a Units module in which infor- 
mation about the specific choice of units the conversion factors between them 
and certain useful physical constants are collected. When a module is added 
to MUSE, the programmer adds a declaration of the units it uses and expects. 
When several modules are imported into a MUSE experiment, the Units mod- 
ule then ensures that all external values passed to each module are properly 



10 





Fig. 2. Radiative transfer module applied to a small N = 1024 particle Plummer 
sphere. Left image shows the intrinsic stellar luminosity at an age of 120 Myr, the 
right image the image after applying the radiative transfer module for the cluster 
in a molecular cloud using a total of 10 photon-packages. The bar to the right of 
each frame indicates the logarithm of the luminosity of the star (left image) and the 
logarithm of the number of photons-packages that arrived in that particular pixel. 

converted. 

Naturally, the flexibility afforded by this approach also introduces some over- 
head. However, this very flexibility is MUSE's great advantage; it allows an 
experimenter to easily mix and match modules until the desired combination 
is found. At that point, the dependence on the Units module can be removed 
(if desired), and conversion of physical units performed by explicit code. This 
leads to more efficient interfaces between modules, while the correctness of the 
manual conversion can be checked at any time against the Units module. 



2. 3 Performance 



Large scale simulations, and in particular the multiscale and multiphysics 
simulations for which our framework is intended, require a large number of 
very different algorithms, many of which achieve their highest performance 
on a specific computer architecture. For example, the gravitational iV-body 
simulations are best performed on a GRAPE enabled PC, the hydrodynamical 
simulations may be efficiently accelerated using GPU hardware, while the 
trivially parallel simultaneous modeling of a thousand single stars is best done 
on a Beowulf cluster or grid computer. 

The top-level organization of where each module should run is managed using 
a resource broker, which is grid enabled (see § 2.4). We include a provision for 
individual packages to indicate the class of hardware on which they operate 
optimally. Some modules are individually parallelized using the MPI library, 



11 



whereas others (like stellar evolution) are handled in a master-slave fashion 
by the top-level manager. 



2.4 MUSE on the grid 



Due to the wide range in computational characteristics of the available mod- 
ules, we generally expect to run MUSE on a computational grid containing a 
number of specialized machines. In this way we reduce the run time by adopt- 
ing computers which are best suited to each module. For example, we might 
select a large GRAPE cluster in Tokyo for a direct iV-body calculation, while 
the stellar evolution is calculated on a Beowulf cluster in Amsterdam. Here 
we report on our preliminary grid interface, which allows us to use remote 
machines to distribute individual MUSE modules. 

The current interface uses the MUSE scheduler as the manager of grid jobs and 
replaces internal module calls with a job execution sequence. This is imple- 
mented with PyGlobus, an application programming interface to the Globus 
grid middleware written in Python. The execution sequence for each module 
consists of: 

• Write the state of a module, such as its initial conditions, to a file, 

• transfer the state file to the destination site 

• construct a grid job definition using the Globus resource specification lan- 
guage 

• submit the job to the grid; the grid job subsequently 

- reads the state file, 

- executes the specified MUSE module, 

- writes the new state of the module to a file, and 

- copies the state file back to the MUSE scheduler 

• then read the new state file and resume the simulation. 



The grid interface has been tested successfully using DAS-3|_i|, which is a five- 
cluster wide-area (in the Netherlands) distributed system designed by the 
Advanced School for Computing and Imaging (ASCI). We executed individ- 
ual invocations of stellar dynamics, stellar evolution, and stellar collisions on 
remote machines. 



sec http : //www . cs . vu . nl/das3/ 



12 



0.3 
< 0.0 



CD 

o 






10 
Time [N-body] 




15 



20 



Fig. 3. Time evolution of the distance between two black holes, each of which 
initially resides in the center of a "galaxy," made up of 32k particles, with total 
mass 100 times greater than the black hole mass. Initially the two galaxies were far 
apart. The curves indicate calculations with the direct integrator (PP), a tree code 
(TC), and using the hybrid met hod in MUSE (PP+TC). The units along the axes 
are dimensionless iV-body units ( Heggie fc Mathieul . Il986l ). 



3 MUSE examples 



3.1 Temporal decomposition of two N-body codes 



Here we demonstrate the possibility of changing the integration method within 
a MUSE application during runtime. We deployed two integrators to simulate 
the merging of two galaxies, each containing a central black hole. The final 
stages of such a merger, with two black holes orbiting one another, can only 
be integrated accurately using a direct method. Since this is computationally 
expensive, the early evolution of such a merger is generally ignored and these 
calculations are typically started some time during the merger process, for 
example when the two black holes form a hard bound pair inside the merged 
galaxy. 

These rather arbitrary starting conditions for the binary black hole merger 
problem can be improved by integrating the initial merger between the two 
galaxies. We use the BHTree code to reduce the computational cost of sim- 
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ulating this merger process. At a predetermined distance between the two 
black holes, when the tree code is unlikely to produce accurate results, the 
simulation is continued using the direct integration method for all particles. 
Overall this results in a considerable reduction in runtime while still allowing 
an accurate integration of the close black hole interaction. 

Fig. 3 shows the results of such a simulation. The initial conditions are two 
Plummer spheres, each consisting of 32k equal-mass particles. At the center 
of each "galaxy" we place a black hole with mass 1% that of the galaxy. 
The two stellar systems are subsequently set on a collision orbit, but at 
a fairly large separation. We use two stellar dynamics modules (see §2.1): 



BHTree (IBarnes fc Hutl . Il986l ). with a fixed shared time step, and phiGRAPE 
( iHarfst et all 120071 ). a direct method using hierarchical block time steps. Both 
modules are GRAPE-enabled and we make use of this to speed up the sim- 
ulation significantly. The calculation is performed three times, once using 
phiGRAPE alone, once using only BHTree, and once using the hybrid method. 
In the latter case the equations of motion are integrated using phiGRAPE if the 
two black holes are within ~ 0.55 N-body unito (indicated by the horizontal 
dashed line in Fig. 3); otherwise we use the tree code. Fig. 3 shows the time 
evolution of the distance between the two black holes. 



The integration with only the direct phiGRAPE integrator took about 250 min- 
utes, while the tree code took about 110 minutes. As expected, the relative 
error in the energy of the direct iV-body simulation (< 10 -6 ) is orders of mag- 
nitude smaller than the error in the tree code (~ 1%). The hybrid code took 
about 200 minutes to finish the simulation, with an energy error a factor of 
~ 10 better than that of the tree code. If we compare the time evolution of 
the black hole separation for the tree and the hybrid code we find that the 
hybrid code reproduces the results of the direct integration (assuming it to 
be the most "correct" solution) quite well. In summary, the hybrid method 
seems to be well suited to this kind of problem, as it produces more accurate 
results than the tree code alone and it is also faster than the direct code. The 
gain in performance is not very large (only ~ 20%) for the particular problem 
studied here, but as the compute time for the tree code scales with iVlogiV 
whereas the direct method scales with N 2 ; a better gain is to be expected for 
larger N. In addition, the MUSE implementation of the tree code is rather 
basic, and both its performance and its accuracy could be improved by using 
a variable block time step. 



see http : //en . wikipedia . org/wiki/Natural_units#N-body_units 
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Fig. 4. Evolution of a merger product formed by a collision between a 10M & star at 
the end of its main-sequence lifetime and a 7M & star of the same age (filled circles) , 
compared to the track of normal star of the same mass (15.7 M©) (triangles). A 
symbol is plotted every 5 x 10 4 yr. Both stars start their evolution at the left of the 
diagram (around T e g ~ 3 x 10 4 K). 

3.2 Stellar mergers in MUSE 



Hydrodynamic interactions such as collisions and mergers can strongly affect 
the overall energy budget of a dense stellar cluster and even alter the timing of 
important dynamical phases such as core collapse. Furthermore, stellar colli- 
sions and close encounters are believed to produce a number of non-canonical 
objects, including blue stragglers, low-mass X-ray binaries, recycled pulsars, 
double neutron star systems, cataclysmic variables and contact binaries. These 
stars and systems are among the most challenging to model and are also among 
the most interesting observational markers. Predicting their numbers, distribu- 
tions and other observable characteristics is essential for detailed comparisons 
with observations. 

When the stellar dynamics module in MUSE identifies a collision, the stellar 
evolution module provides details regarding the evolutionary state and struc- 
ture of the two colliding stars. This information is then passed on to the stellar 
collision module, which calculates the structure of the merger remnant, return- 
ing it to the stellar evolution module, which then continues its evolution. This 
detailed treatment of stellar mergers requires a stellar evolution module and 
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a collision treatment which resolve the internal structure of the stars; there is 
no point in using a sticky-sphere approach in combination with a Henyey-type 
stellar evolution code. 



Fig. 4 presents the evolutionary track of a test case in which EVTwin (lEggleton 



197ll ) (generally the more flexible TWIN code is used, which allows the evolution 
of two stars in a binary) was used to evolve the stars in our experiment. A 
10 M Q star near the end of its main-sequence collided with a 7M Q star of 
the same age. The collision itself was resolved using MMAMS. The evolution 
of the resulting collision product continued using EVTwin, which is presented 
as the solid curve in Fig. 4. For comparison we also plot (dashed curve) the 
evolutionary track of a star with the same mass as the merger product. The 
evolutionary tracks of the two stars are quite different, as are the timescales on 
which the stars evolve after the main sequence and through the giant branch. 

The normal star becomes brighter as it follows an ordinary main-sequence 
track, whereas the merged star fades dramatically as it re-establishes ther- 
mal equilibrium shortly after the collision. The initial evolution of the merger 
product is numerically difficult, as the code attempts to find an equilibrium 
evolutionary track, which is hard because the merger product has no hydrogen 
in its core. As a consequence, the star leaves the main-sequence almost directly 
after it establishes equilibrium, but since the core mass of the star is unusually 
small (comparable to that of a 10 M star at the terminal-age main sequence) 
it is under luminous compared to the normal star. The slight kink in the evolu- 
tionary track between log 10 T e g = 4.2 and 4.3 occurs when the merger product 
starts to burn helium in its core. The star crosses the Hertzsprung gap very 
slowly (in about 1 Myr), whereas the normal star crosses within a few 10,000 
years. This slow crossing is caused by the small core of the merger product, 
which first has to grow to a mass to be consistent with a ~ 15.7 M star 
before it can leave the Hertzsprung gap. The episode of extended Hertzsprung 
gap lifetime is interesting as observing an extended lifetime Hertzsprung gap 
star is much more likely than witnessing the actual collision. Observing a star 
on the Hertzsprung gap with a core too low in mass for its evolutionary phase 
would therefore provide indirect evidence for the collisional history of the star 
(regretfully one would probably require some stellar stethoscope to observe 
the stellar core in such a case). 



3.3 Hybrid N-body simulations with stellar evolution 



Dense star clusters move in the potential of a lower density background. For 
globular clusters this is the parent's galaxy halo, for open star clusters and 
dense young clusters it is the galactic disc, and nuclear star clusters are em- 
bedded in their galaxy's bulge. These high-density star clusters are preferably 
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modeled using precise and expensive direct-integration methods. For the rela- 
tively low density regimes, however, a less precise method would suffice; saving 
a substantial amount of compute time and allowing a much larger number of 
particles to simulate the low-density host environment. In §3.1 we described 
a temporal decomposition of a problem using a tree code 0(N\og(N)) and a 
direct iV-body method. Here we demonstrate a spatial domain decomposition 
using the same methods. 



The calculations perfo rmed in th i s § a re run via a MUSE module which 



is based on BRIDGE flFuiii et all 120071 ). Within BRIDGE a homogeneous 
and seamless transition between the different numerical domains is possible 
with a similar method as is used in the mixed- v ariable symplectic method 
( JKinoshita et al.l . Il99ll ; IWisdom fc Holmanl . Il99ll ) , in which the Hamiltonian 
is divided into two parts: an analytic Keplerian part and the individual inter- 
actions between particles. The latter are used to perturb the regular orbits. In 
our implementation the accurate direct method, used to integrate the high- 
density regions, is coupled to the much faster tree-code, which integrates the 
low-density part of the galaxies. The stars in the high-density regions are 
perturbed by the particles in the low-density environment. 



The method implemented in MUSE and presented here uses an accurate direct 
TV-body solver (like HermiteO) for the high density regime whereas the rest of 
the system is integrated using BHTree. In principle, the user is free to choose 
the integrator used in the accurate part of the integration; in our current 
implementation we adopt HermiteO, but the tree-code is currently petrified in 
the scheduler. This version of BRIDGE is currently not available in the public 
version of MUSE. 



To demonstrate the working of this hybrid scheme we simulate the evolution 
of a star cluster orbiting withi n a galaxy. The star cluster is represented by 
8192 particles with a Salpeter (jSalpeterl . Il955l ) mass fun ction betwee n 1 and 
100 M Q distributed according to a Wo = 10 King model (jKingl . Il966l ) density 
profile. This cluster is located at a distance of 16 pc from the center of the 
galaxy, with a velocity of 65 km s _1 in the transverse direction. The galaxy 
is represented by 10 6 equal-mass particles in a W = 3 King model density 
distribution. The stars in the star cluster are evolved using the MUSE stellar 
evolution module EFT89, the galaxy particles have all the same mass of 30 M 
and were not affected by stellar evolution. 



The cluster, as it evolves internally, spirals in towards the galactic center 
due to dynamical friction. While the cluster spirals in, it experiences core 
collapse. During this phase many stars are packed together in the dense cluster 
core and stars start to collide with each other in a collision runaway process 
(IPortegies Zwart et al.l . Il999l ). These collisions are handled internally in the 
direct part of BRIDGE. Throughout the core collapse of the cluster about a 
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Fig. 5. Results of the hybrid iV-body simulation using a 4th-order Hermite scheme 
for the particles integrated directly and a Barnes-Hut tree algorithm for the others. 
The top left panel: distance from the cluster to the Galactic center, top right: 
evolution of the cluster core radius, bottom left: bound cluster mass, bottom right: 
evolution of the mass of a few cluster stars that happen to experience collisions. 
The two crosses in the bottom right panel indicate the moment that two collision 
products coalesce with the runaway merger. 

dozen collisions occur with the same star, causing it to grow in mass to about 
700 M Q . A lthough the stellar evolution of s uch c ollision products is highly 



uncertain (jBelkus et all 120071 : ISuzuki et all 120071 ). we assume here that the 



massive star collapses to a black hole of intermediate mass. 



The direct part as well a s the tree-part of th e simulation was performed on a 
full 1 Tflops GRAPE-6 ( IMakino et all 120031 ) . whereas the tree-code was run 
on the host PC. The total CPU time for this simulation was about half a day, 
whereas without BRIDGE the run would have taken years to complete. The 
majority (~ 90%) of the compute time was spent in the tree code, integrat- 
ing the 10 6 particles in the simulated galaxy. (Note again that this fraction 
depends on the adopted models and the use of special-purpose hardware to 
accelerate the direct part of the integration.) Total energy was conserved to 
better than 2 x 10~ 4 (initial total energy was -0.25). 

The results of the simulation are presented in Fig. 5. Here we see how the 
cluster (slightly) spirals in, due to dynamical friction with the surrounding 
(tree-code) stars, toward the galactic center before dissolving at an age of 
about 4 Myr. By that time, however, the runaway collision has already resulted 



in a single massive star of more than 700 M . 



The description of stellar evolution adopted in this calculation is rather simple 
and does not incorporate realistic mass loss, and it is expected that the colli- 
sion runaway will have a mass of ~ 50M Q by the time it collapses to a black 
hole in a supernova expl osion. The supernova itself may be unusu ally bright 
(possibly like SN2006gy (IPortegies Zwart fc van den Heuvell. 120070) an d may 
collapse to a relatively massive black hole (IPortegies Zwart et all 12004 ) . Simi- 
lar collision r unaway results were obtained using d irect ./V-bo dy simulations us- 



ing s tarlab ( IPortegies Zwart fc McMillan!, 120021) andNBODY (IBaumgardt et al 
20(9) , and with Monte-Carlo (JGurkanelalJ, |2q3 iFreitag et all I2006J 1 stellar" 
dynamics simulations. 



3.4 Direct N-body dynamics with live stellar evolution 



While MUSE contains many self-contained dynamical modules, all of the stel- 
lar evolution modules described thus far have relied on simple analytical for- 
mulations or lookup formulae. Here we present a new simulation combining 
a dynamical integrator with a "live" stellar evolution code, following the de- 
tailed internal evolution of stars in real time as the dynamics unfolds. A similar 
approach has been undertaken by Ross Church, in his PhD thesis. The novel 
ingredient in thi s calculation is a MUSE interface onto the EVTwin stellar evo- 
lution program (lEggletonl . 120061 ) . modified for use within MUSE (see §3.2 for 
a description). 



In keeping with the philosophy of not rewriting existing working code, in 
incorporating EVTwin into MUSE, we have made minimal modifications to the 
program's internal structure. Instead, we wrap the program in a F90 data- 
management layer which maintains a copy of the stellar data for each star 
in the system. Advancing a system of stars simply entails copying the chosen 
star into the program and telling it to take a step. EVTwin proceeds with the 
task at hand, blissfully unaware that it is advancing different stellar models 
at every invocation (see §3.2). 

Figure 6 shows four snapshots during the evolution of a 1024-body system, car- 
ried out within MUSE using EVTwin and the simple shared-timestep hermit eO 
dynamical module. Initially the stars had a mass function dN/dm oc m~ 2,2 for 
0.25M Q < m < 15M , for a mean mass of 0.92M Q and were distributed ac- 
cording to a Plummer density profile with a dynamical time scale of 10 Myr, 
a value chosen mainly to illustrate the interplay between dynamics and stellar 
evolution. (The initial cluster half-mass radius was ~ 15 pc.) The initial half- 
mass relaxation time of the system was 377 Myr. The four frames show the 
state of the system at times 0, 200, 400, and 600 Myr, illustrating the early 
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Fig. 6. Evolution of a 1024-body cluster, computed using tni nermiteO and EVTwin 
MUSE modules. The four rows of images show the physical state of the cluster (left) 
and the cluster H-R diagram (right) at times (top to bottom) 0, 200, 400, and 600 
Myr. Colors reflect stellar temperature, and radii are scaled by the logarithm of the 
stellar radius. ^ 



mass segregation and subsequent expansion of the system as stars evolve and 
lose mass. 

The integrator was kept deliberately simple, using a softened gravitational po- 
tential to avoid the need for special treatment of close encounters, and there 
was no provision for stellar collisions and mergers. Both collisions and close 
encounters will be added to the simulation, and described in a future paper. 
We note that, although the hermit eO module is the least efficient member of 
the MUSE dynamical suite, nevertheless the CPU time taken by the simula- 
tion was roughly equally divided between the dynamical and stellar modules. 
Even without hardware acceleration (by GRAPE or GPU), a more efficient 
dynamical integrator (such as one of the individual block time step schemes 
already installed in MUSE) would run at least an order of magnitude faster, 
underscoring the need for careful load balancing when combining modules in 
a hybrid environment. 



4 Discussion 



The Multiscale Software Environment presented in this paper provides a di- 
verse and flexible framework for numerical studies of stellar systems. Now 
that the Noah's Ark milestone has been reached, one can ask what new chal- 
lenges MUSE has to offer. Many of the existing modules have been adapted 
for grid use and, as demonstrated in §2.4, MUSE can be used effectively to 
connect various computers around the world. However, there are currently a 
number of limitations in its use, and in its range of applications, which will be 
addressed in the future. Most of the current application modules remain un- 
suitable for large-scale scientific production simulations. The stellar dynamics 
codes do not yet efficiently deal with close binaries and multiples, although 
modules are under development, and external potentials, though relatively 
easy to implement, have not yet been incorporated. Binary evolution is not 
implemented, and the diagnostics available to study the output of the various 
modules remain quite limited. 

Many improvements can be made to the environment, and we expect to include 
many new modules, some similar to existing ones, others completely different 
in nature. The current framework has no method for simulating interstellar 
gas, although this would be an extremely valuable addition to the framework, 
enabling study of gas-rich star clusters, galaxy collisions, colliding-wind binary 
systems, etc. In addition, radiation transfer is currently not implemented, nor 
are radiative feedback mechanisms between stars and gas. Both would greatly 
increase the applicability base of the framework. However, both are likely to 
challenge the interface paradigm on which MUSE is based. 
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The current MUSE setup, in which the individual modules are largely decou- 
pled, has a number of attractive advantages over a model in which we allow 
direct memory access. The downside is that MUSE in its present form works 
efficiently only for systems in which the various scales are well separated. Com- 
munication between the various modules, even of the same type, is currently 
all done via the top interface layer. For small studies, this poses relatively 
little overhead, but for more extensive calculations, or those in which more 
detailed data must be shared, it is desirable to minimize this overhead. One 
way to achieve this would be by allowing direct data access between modules. 
However, for such cases, the unit conversion modules could not be used, and 
consistency in the units between the modules cannot be guaranteed. As a re- 
sult, each module would be required to maintain consistent units throughout, 
which may be hard to maintain and prone to bugs. In addition, the general 
problem of sharing data structures between modules written in different lan- 
guages, currently resolved by the use of the glue language, resurfaces. 
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